Factored MLLR Adaptation Algorithm for HMM-based Expressive TTS
Authors
Abstract
One of the most popular approaches to parameter adaptation in hidden Markov model (HMM) based systems is the maximum likelihood linear regression (MLLR) technique. In our previous work, we proposed factored MLLR (FMLLR), in which each MLLR parameter is defined as a function of a control parameter vector, and presented a method to train the FMLLR parameters within the general framework of the expectation-maximization (EM) algorithm. To demonstrate its effectiveness, we applied FMLLR to adapt the spectral envelope features of reading-style speech to those of the singing voice. In this paper, we apply FMLLR to HMM-based expressive speech synthesis and compare its performance with conventional approaches. Across a series of experiments, FMLLR performs better than the conventional methods.
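The idea of making an MLLR transform depend on a control vector can be illustrated with a small sketch. The mean update `mu_hat = A @ mu + b` is the standard MLLR mean transform. The abstract does not give the functional form of the factored transform, so the basis-sum form used here, along with the names `mllr_adapt_mean`, `factored_transform`, `bases`, and `v`, is an assumption made for illustration only and may differ from the authors' formulation.

```python
# Illustrative sketch only. The basis-sum form of the factored transform
# is an assumption; the source abstract does not specify it.

def mllr_adapt_mean(A, b, mu):
    """Standard MLLR mean update: mu_hat = A @ mu + b (pure-Python lists)."""
    return [sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
            for row, b_i in zip(A, b)]

def factored_transform(bases, v):
    """Hypothetical factored form:
    (A, b) = (A_0, b_0) + sum_k v[k] * (A_{k+1}, b_{k+1}),
    where bases[0] is the offset transform and v is the control vector."""
    A0, b0 = bases[0]
    A = [row[:] for row in A0]   # copy so the bases are not modified
    b = b0[:]
    for v_k, (A_k, b_k) in zip(v, bases[1:]):
        for i, row in enumerate(A_k):
            for j, a in enumerate(row):
                A[i][j] += v_k * a
            b[i] += v_k * b_k[i]
    return A, b

# Toy 2-dimensional example with one control dimension (values are made up).
identity = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
style_basis = ([[0.5, 0.0], [0.0, 0.0]], [1.0, -1.0])
bases = [identity, style_basis]
mu = [2.0, 4.0]

neutral = mllr_adapt_mean(*factored_transform(bases, [0.0]), mu)
half = mllr_adapt_mean(*factored_transform(bases, [0.5]), mu)
full = mllr_adapt_mean(*factored_transform(bases, [1.0]), mu)
```

With a control value of 0 the mean is left unchanged, and intermediate control values give intermediate transforms, which is the kind of continuous control over adaptation that a factored parameterization is meant to provide. In a real system the basis transforms would be estimated with EM from adaptation data rather than set by hand.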
Similar resources
Factored MLLR Adaptation for HMM-based Expressive Speech Synthesis
One of the most popular approaches to parameter adaptation in hidden Markov model (HMM) based systems is the maximum likelihood linear regression (MLLR) technique. In our previous work, we proposed factored MLLR (FMLLR) where an MLLR parameter is defined as a function of a control parameter vector. We presented a method to train the FMLLR parameters based on a general framework of the expectationm...
Factored MLLR Adaptation for Singing Voice Generation
In our previous study, we proposed factored MLLR (FMLLR) where each MLLR parameter is defined as a function of a control vector. We presented a method to train the FMLLR parameters based on a general framework of the expectation-maximization (EM) algorithm. In this paper, we extend the FMLLR structure from diagonal to unrestricted full matrix with a sophisticated algorithm for the training of re...
Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR
This paper describes a technique for synthesizing speech with arbitrary speaker characteristics using speaker-independent speech units, which we call “average voice” units. The technique is based on an HMM-based text-to-speech (TTS) system and the MLLR adaptation algorithm. In the HMM-based TTS system, speech synthesis units are modeled by multi-space probability distribution (MSD) HMMs which ca...
Text-to-speech synthesis with arbitrary speaker's voice from average voice
This paper describes a technique for synthesizing speech with any desired voice. The technique is based on an HMM-based text-to-speech (TTS) system and the MLLR adaptation algorithm. To generate speech of an arbitrarily given target speaker, speaker-independent speech units, i.e., average voice models, are adapted to the target speaker using the MLLR framework. In addition to spectrum and pitch adaptati...
Formant-based frequency warping for improving speaker adaptation in HMM TTS
Vocal Tract Length Normalization (VTLN), usually implemented as a frequency warping procedure (e.g. bilinear transformation), has been used successfully to adapt the spectral characteristics to a target speaker in speech recognition. In this study we exploit the same concept of frequency warping but concentrate explicitly on mapping the first four formant frequencies of 5 long vowels from sourc...